This is an R Markdown document that walks through some of the basics of using the R language for data querying and data analysis, and we’ll even get into some basics of graphics in R.
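R is a language for statistical computing and graphics. It is an implementation of the S language, which was developed at Bell Labs. R itself was created by Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand, and is currently maintained by the R Core Team.
Comprehensive R Archive Network (CRAN)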
A system of “mirrors”
Closest CRAN mirror to UNC Charlotte: Duke University
The first time you install R, you will install the Base package.
Has most of what you need to run standard statistics and most data querying procedures.
Other packages can be downloaded from CRAN for more specific or specialized tasks.
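Here’s a minimal sketch of the usual install-then-load pattern. The package name is a placeholder I picked so the code runs anywhere: "stats" already ships with R, so nothing gets downloaded. Swap in the name of whatever CRAN package you actually need.

```r
# Placeholder package name; "stats" ships with R, so this never hits the network
pkg <- "stats"

# Install only if the package is not already on this machine
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package for this session
library(pkg, character.only = TRUE)

# Confirm that the package is now attached
pkg_loaded <- paste0("package:", pkg) %in% search()
pkg_loaded
```

You only install a package once per machine, but you load it with library() in every new R session.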
New versions of R are released every couple of months.
Names of R Versions are fun to watch.
But where do the names come from?
Sock It To Me
Frisbee Sailing
Full of Ingredients
Single Candle (And, You Stupid Darkness)
To get the most up-to-date version, just uninstall the version you have and download the new one.
When working with R, you’ll probably want to use RStudio, which is an IDE, or “integrated development environment.”
R works by writing code and executing the code.
RStudio just helps you to write that code and execute it properly.
It also lets you easily adjust things like color schemes, tile arrangement and so on.
Tools > Global Options > Pane Layout
Tools > Global Options > Appearance
The two main windows you need to be able to see are Source and Console.
Make sure when you open RStudio that you have those two windows.
You will write your commands in the Source window. Those commands form your R script. (input)
When you execute those commands, the results will be shown in the Console window. (output)
The other window options:
Files, Plots, Packages, Help, Viewer
Environment, History, Connections
R is an object-oriented language
SPSS, SAS, and Stata are procedural languages
Generally, object oriented languages operate by creating or defining objects and then executing functions on those objects.
Objects can be:
Most of the data sets you’ll work with will be data frames.
Objects have attributes:
Which functions you can execute on those objects depends on the class, and sometimes the length.
The name you give an object is completely arbitrary and totally up to you.
Words of advice when naming objects:
Okay. Let’s create a few objects.
First, a character string.
x <- "Hello, World!" # Doesn't generate output, just creates the object, "x"
print(x) # Executes the print() function on the object, "x"
## [1] "Hello, World!"
R as a big calculator
x <- 3
y <- 7
x+y
## [1] 10
Notice how I created an object called “x” again? Because R is dynamic, it replaced the old contents with the new contents automatically.
Attributes of Objects
class(y)
## [1] "numeric"
length(y)
## [1] 1
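To see how the class decides which functions make sense, here’s a minimal sketch. The objects scores and words are made up for illustration; they aren’t part of the workshop data.

```r
scores <- c(3, 7, 10)          # a numeric vector with three elements
words  <- c("Hello", "World")  # a character vector with two elements

class(scores)   # "numeric"
length(scores)  # 3
class(words)    # "character"
length(words)   # 2

sum(scores)     # 20; sum() works on numbers
toupper(words)  # "HELLO" "WORLD"; toupper() works on character strings

# sum() cannot add character strings, so this call fails.
# try() catches the error so the rest of the script keeps running.
result <- try(sum(words), silent = TRUE)
inherits(result, "try-error")  # TRUE
```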
In R, you will want to write scripts that store the code you write.
The code is the “input” - tells the computer what to do
No need to save the output if you save the input.
In RStudio, you’ll write these in the “Source” pane.
You can save them as .R files.
You can also save them as unformatted text, like in a Google doc, or an Evernote file.
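If you want to see the “save the input, regenerate the output” idea in action, here’s a small sketch. It writes a three-line script to a temporary file (the file and its contents are just for illustration) and then runs it with source().

```r
# Write a tiny script to a temporary .R file (illustration only)
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 3", "y <- 7", "total <- x + y"), script_path)

# Run every line of the saved script, in order
source(script_path)

# The objects created by the script are now available
total
```

In practice you’d save your own script from the Source pane and run it the same way, or just use the Run button in RStudio.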
R can be used to open data stored in a variety of file types
Most common - the Comma Separated Value (.csv) file
You’ll use the read.csv() function to open these kinds of files.
This function simply reads the contents of a CSV file.
You’ll also need to create an object, and then pass the contents of the CSV file to that object.
# data <- read.csv(file.choose(),header=TRUE)
I put the comment character (#) at the beginning of the line to stop it from running, but that’s only because you can’t use the file.choose() option in an R Markdown file.
When you run this yourself, leave off the comment character.
The file.choose() option just tells R to open a file explorer window where you can manually specify which file you want to open. The header=TRUE option tells R that the first row of the data file is not actual data, but rather is a header containing the names of the variables.
If you don’t use the file.choose() option, you’ll have to specify the file path, which you can do, but can be cumbersome.
In this file, you’ll see me use specific file names, because I can’t use the file.choose() option in an R Markdown file. But that’s okay, don’t worry about it. When you’re running RStudio normally, you can use it all you like.
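Here’s a minimal sketch of reading a CSV by file path instead of file.choose(). To keep it self-contained, it first writes a tiny CSV to a temporary file; the file and the object name demo are stand-ins, and the two rows just echo the first two months of the homicide data.

```r
# Create a small CSV file to read (stand-in for a real data file)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Year,Month,Mexico",
             "2008,January,248",
             "2008,February,245"),
           csv_path)

# Read the file by path; header = TRUE treats the first row as variable names
demo <- read.csv(csv_path, header = TRUE)

class(demo)  # "data.frame"
names(demo)  # the variable names taken from the header row
nrow(demo)   # 2 rows of actual data
```

With a real file, you’d replace csv_path with the path to the file on your computer, in quotes.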
Anywho, go ahead and open the mexico.csv data file I sent you. Let’s create an object called “mexico”. The code will look like this, except without the “#” at the beginning.
# mexico <- read.csv(file.choose(),header=TRUE)
# Below, I'm going to actually open the file for me to use in the Markdown file. You should use the code above, not below, to open the data on your computer.
mexico <- read.csv("reforma_ejecutometro.csv",header=TRUE)
Having opened some data, we need to make sure things worked the way we wanted to.
Check the object to make sure the data you open are the data you intended to open.
class(mexico) # Asks for the class of the "mexico" object
## [1] "data.frame"
str(mexico) # Asks for the structure of the object
## 'data.frame': 59 obs. of 40 variables:
## $ Period : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Year : int 2008 2008 2008 2008 2008 2008 2008 2008 2008 2008 ...
## $ Month : chr "January" "February" "March" "April" ...
## $ Mexico : int 248 245 295 290 474 329 549 524 497 554 ...
## $ Border : int 101 129 171 114 227 158 233 292 220 362 ...
## $ Nonborder : int 147 116 124 176 247 171 316 232 277 192 ...
## $ Heroin : int 125 125 185 135 342 211 384 351 292 280 ...
## $ GoldenTriangle : int 65 100 157 93 285 168 331 309 221 256 ...
## $ Aguascalientes : int 0 2 0 2 3 2 6 10 6 1 ...
## $ BajaCalifornia : int 33 23 25 43 9 28 18 18 61 110 ...
## $ BCSur : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Campeche : int 0 0 0 0 0 0 3 0 0 0 ...
## $ Coahuila : int 6 10 7 3 3 14 2 1 3 3 ...
## $ Colima : int 0 0 0 0 0 0 0 1 1 1 ...
## $ Chiapas : int 3 1 0 6 1 3 5 3 0 0 ...
## $ Chihuahua : int 41 66 117 49 175 100 181 243 133 182 ...
## $ DistritoFederal: int 11 6 1 18 20 5 29 15 9 14 ...
## $ Durango : int 11 21 11 10 37 19 18 19 24 21 ...
## $ Guanajuato : int 2 3 8 7 2 5 2 8 2 6 ...
## $ Guerrero : int 21 6 15 22 39 23 17 19 36 11 ...
## $ Hidalgo : int 5 1 3 3 2 3 1 5 7 1 ...
## $ Jalisco : int 6 10 14 10 12 8 16 15 11 20 ...
## $ Edomex : int 26 24 22 25 24 16 37 36 64 38 ...
## $ Michoacan : int 28 18 7 10 17 11 31 19 34 11 ...
## $ Morelos : int 0 0 0 0 8 6 4 0 2 3 ...
## $ Nayarit : int 0 0 0 1 1 0 0 1 0 0 ...
## $ NuevoLeon : int 3 14 6 13 7 0 3 3 4 18 ...
## $ Oaxaca : int 11 1 6 9 0 9 5 3 1 2 ...
## $ Puebla : int 0 0 0 0 1 2 1 2 3 1 ...
## $ Queretaro : int 0 1 0 1 0 0 3 2 0 0 ...
## $ QuintanaRoo : int 3 2 2 3 0 0 2 3 0 0 ...
## $ SanLuisPotosi : int 3 2 2 4 0 3 2 4 0 1 ...
## $ Sinaloa : int 13 13 29 34 73 49 132 47 64 53 ...
## $ Sonora : int 5 8 4 3 20 10 25 20 12 18 ...
## $ Tabasco : int 1 0 1 2 1 3 1 1 2 5 ...
## $ Tamaulipas : int 13 8 12 3 13 6 4 7 7 31 ...
## $ Tlaxcala : int 0 0 0 0 0 0 1 0 0 0 ...
## $ Veracruz : int 2 2 3 2 3 1 0 7 6 2 ...
## $ Yucatan : int 1 3 0 0 0 0 0 12 0 1 ...
## $ Zacatecas : int 0 0 0 7 3 3 0 0 5 0 ...
summary(mexico) # Asks for a summary of the object
## Period Year Month Mexico
## Min. : 1.0 Min. :2008 Length:59 Min. : 245.0
## 1st Qu.:15.5 1st Qu.:2009 Class :character 1st Qu.: 540.5
## Median :30.0 Median :2010 Mode :character Median : 753.0
## Mean :30.0 Mean :2010 Mean : 763.5
## 3rd Qu.:44.5 3rd Qu.:2011 3rd Qu.: 964.5
## Max. :59.0 Max. :2012 Max. :1717.0
## Border Nonborder Heroin GoldenTriangle
## Min. :101.0 Min. : 116.0 Min. : 125.0 Min. : 65.0
## 1st Qu.:232.0 1st Qu.: 290.0 1st Qu.: 325.0 1st Qu.:246.0
## Median :320.0 Median : 436.0 Median : 393.0 Median :295.0
## Mean :333.9 Mean : 429.6 Mean : 433.8 Mean :323.5
## 3rd Qu.:403.0 3rd Qu.: 533.0 3rd Qu.: 521.0 3rd Qu.:396.0
## Max. :654.0 Max. :1065.0 Max. :1048.0 Max. :757.0
## Aguascalientes BajaCalifornia BCSur Campeche
## Min. : 0.000 Min. : 3.00 Min. :0.0000 Min. :0.0000
## 1st Qu.: 0.000 1st Qu.: 10.00 1st Qu.:0.0000 1st Qu.:0.0000
## Median : 2.000 Median : 18.00 Median :0.0000 Median :0.0000
## Mean : 2.271 Mean : 25.64 Mean :0.4068 Mean :0.2034
## 3rd Qu.: 4.000 3rd Qu.: 28.50 3rd Qu.:0.0000 3rd Qu.:0.0000
## Max. :10.000 Max. :151.00 Max. :4.0000 Max. :3.0000
## Coahuila Colima Chiapas Chihuahua
## Min. : 0.00 Min. : 0.000 Min. : 0.000 Min. : 36.0
## 1st Qu.: 6.50 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.:113.0
## Median : 20.00 Median : 4.000 Median : 2.000 Median :162.0
## Mean : 29.47 Mean : 6.169 Mean : 5.271 Mean :168.6
## 3rd Qu.: 50.50 3rd Qu.: 9.500 3rd Qu.: 3.500 3rd Qu.:205.0
## Max. :101.00 Max. :28.000 Max. :170.000 Max. :360.0
## DistritoFederal Durango Guanajuato Guerrero
## Min. : 1.00 Min. : 10.00 Min. : 0.00 Min. : 6.00
## 1st Qu.:11.00 1st Qu.: 27.00 1st Qu.: 2.00 1st Qu.: 39.50
## Median :15.00 Median : 45.00 Median : 4.00 Median : 66.00
## Mean :15.31 Mean : 53.36 Mean : 5.78 Mean : 74.46
## 3rd Qu.:19.00 3rd Qu.: 67.00 3rd Qu.: 8.00 3rd Qu.:101.50
## Max. :35.00 Max. :364.00 Max. :30.00 Max. :205.00
## Hidalgo Jalisco Edomex Michoacan
## Min. : 0.000 Min. : 5.0 Min. :16.00 Min. : 7.00
## 1st Qu.: 1.000 1st Qu.: 17.0 1st Qu.:25.50 1st Qu.:14.00
## Median : 2.000 Median : 42.0 Median :35.00 Median :19.00
## Mean : 2.898 Mean : 40.2 Mean :36.29 Mean :24.44
## 3rd Qu.: 4.000 3rd Qu.: 59.5 3rd Qu.:41.50 3rd Qu.:32.00
## Max. :12.000 Max. :108.0 Max. :89.00 Max. :72.00
## Morelos Nayarit NuevoLeon Oaxaca
## Min. :-3.00 Min. : 0.000 Min. : 0.00 Min. : 0.000
## 1st Qu.: 2.50 1st Qu.: 0.500 1st Qu.: 7.50 1st Qu.: 0.000
## Median : 9.00 Median : 4.000 Median : 48.00 Median : 1.000
## Mean :11.53 Mean : 8.492 Mean : 63.85 Mean : 2.881
## 3rd Qu.:16.00 3rd Qu.:12.000 3rd Qu.:112.50 3rd Qu.: 5.000
## Max. :52.00 Max. :56.000 Max. :257.00 Max. :16.000
## Puebla Queretaro QuintanaRoo SanLuisPotosi
## Min. :0.000 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.:1.000 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 1.000
## Median :3.000 Median : 0.000 Median : 2.000 Median : 4.000
## Mean :3.102 Mean : 1.169 Mean : 2.932 Mean : 8.678
## 3rd Qu.:4.500 3rd Qu.: 1.500 3rd Qu.: 4.000 3rd Qu.:12.500
## Max. :9.000 Max. :11.000 Max. :19.000 Max. :47.000
## Sinaloa Sonora Tabasco Tamaulipas
## Min. : 13.0 Min. : 0.00 Min. : 0.000 Min. : 0.00
## 1st Qu.: 65.0 1st Qu.: 6.00 1st Qu.: 0.000 1st Qu.: 6.00
## Median : 98.0 Median :12.00 Median : 1.000 Median : 18.00
## Mean :101.6 Mean :14.03 Mean : 2.932 Mean : 32.34
## 3rd Qu.:125.0 3rd Qu.:18.00 3rd Qu.: 4.000 3rd Qu.: 46.00
## Max. :237.0 Max. :55.00 Max. :19.000 Max. :168.00
## Tlaxcala Veracruz Yucatan Zacatecas
## Min. :0.000 Min. : 0.00 Min. : 0.000 Min. : 0.000
## 1st Qu.:0.000 1st Qu.: 2.00 1st Qu.: 0.000 1st Qu.: 0.000
## Median :0.000 Median : 6.00 Median : 0.000 Median : 5.000
## Mean :0.339 Mean :11.44 Mean : 0.339 Mean : 7.119
## 3rd Qu.:0.000 3rd Qu.:14.50 3rd Qu.: 0.000 3rd Qu.:10.000
## Max. :5.000 Max. :72.00 Max. :12.000 Max. :33.000
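A few other quick checks are worth knowing. This is a sketch on a hand-built miniature data frame (mexico_head is a made-up name, holding just the first three months from the output above), not on the full file.

```r
# A miniature data frame built by hand for illustration
mexico_head <- data.frame(
  Period = 1:3,
  Month  = c("January", "February", "March"),
  Mexico = c(248, 245, 295),
  Border = c(101, 129, 171)
)

dim(mexico_head)      # number of rows, then number of columns
names(mexico_head)    # the variable names
head(mexico_head, 2)  # the first two rows
```

On the real data, dim(mexico) should agree with the “59 obs. of 40 variables” line that str() printed.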
You can also use the fix() function on the object to open up a spreadsheet view of the data.
But be careful! Any edits you make in that window are saved back to the object when you close it.
Variables are stored inside this data frame. You can reference the variables by name, using the $ operator. Here’s how it works.
Let’s say you want to ask for summary statistics on the drug-related homicide counts in Tamaulipas.
Inside the “mexico” data frame, there is a variable called “Tamaulipas”
We just need to ask R to show us some summary statistics on that variable.
We’ll use the summary() function to do so.
summary(mexico$Tamaulipas)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 6.00 18.00 32.34 46.00 168.00
Or, maybe you wanted to know the mean and standard deviation of the homicides in Quintana Roo (you know, where Cancun and Cozumel are.)
Good thing there are functions to help us do that.
mean(mexico$QuintanaRoo)
## [1] 2.932203
sd(mexico$QuintanaRoo)
## [1] 3.26351
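The $ operator isn’t the only way to reach a variable. Here’s a small sketch of two alternatives, using a made-up five-row data frame (mexico_demo) that copies the first five months of two states from the output above.

```r
# Five-row stand-in for the real data frame
mexico_demo <- data.frame(
  Tamaulipas  = c(13, 8, 12, 3, 13),
  QuintanaRoo = c(3, 2, 2, 3, 0)
)

# Two equivalent ways to pull out one variable
by_dollar   <- mexico_demo$QuintanaRoo
by_brackets <- mexico_demo[["QuintanaRoo"]]
identical(by_dollar, by_brackets)  # TRUE

# with() lets you use variable names without repeating the data frame name
with(mexico_demo, mean(QuintanaRoo))  # 2

# sapply() applies one function to every variable in the data frame
sapply(mexico_demo, mean)
```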
Let’s say you don’t want to take the time to write out the data frame and variable name every time.
Well, you can create another object, a single vector, that’s just a single variable by itself.
Let’s pull out the homicides for the six Mexican states on the US-Mexico border. Luckily, there’s a variable that already exists with that information, called “Border”.
border <- mexico$Border
We can do the same thing for the non-border states (i.e. the other 26 states).
nonborder <- mexico$Nonborder
Now, let’s look for the Pearson’s Correlation Coefficient between these two variables.
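Pearson’s Correlation Coefficient = How strongly are two variables linearly related? It ranges from -1 to 1, and values near either end mean one variable is highly predictable from the other.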
cor(border,nonborder)
## [1] 0.8262518
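If you’re curious what cor() is doing under the hood, here’s a sketch that computes Pearson’s r by hand and compares it to the built-in function. The two short vectors are stand-ins holding only the first five months of each variable, so the value won’t match the full-data result above.

```r
# First five months only, for illustration
border_demo    <- c(101, 129, 171, 114, 227)
nonborder_demo <- c(147, 116, 124, 176, 247)

# Deviations of each value from its own mean
dx <- border_demo - mean(border_demo)
dy <- nonborder_demo - mean(nonborder_demo)

# Pearson's r: sum of cross-products over the square root of the
# product of the two sums of squares
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# The built-in function gives the same value
r_builtin <- cor(border_demo, nonborder_demo)
all.equal(r_manual, r_builtin)  # TRUE
```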
What if we want to make new variables based on the ones we already have?
Let’s make a categorical variable that bins the homicides in all of Mexico into ranges.
First, let’s look at how the homicides are distributed.
totalmex <- mexico$Mexico
summary(totalmex)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 245.0 540.5 753.0 763.5 964.5 1717.0
table(totalmex)
## totalmex
## 245 248 290 295 329 450 459 463 464 474 495 497 510 524 540 541
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 549 551 554 561 584 639 645 663 671 709 717 718 735 753 761 762
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 765 786 806 831 873 894 908 914 920 923 928 948 981 983 994 1045
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1
## 1054 1068 1076 1123 1163 1286 1303 1314 1717
## 2 1 1 1 1 1 1 1 1
Our new variable will group together months with between 0 and 500 homicides, 501 to 1000 homicides, and 1001 homicides and above.
Ready for some sweet code action?
mexbin <- vector()
length(mexbin) <- length(totalmex)
mexbin[totalmex >= 0 & totalmex < 501] <- "1. 0-500"
mexbin[totalmex >= 501 & totalmex < 1001] <- "2. 501-1000"
mexbin[totalmex >= 1001] <- "3. 1001+"
table(mexbin)
## mexbin
## 1. 0-500 2. 501-1000 3. 1001+
## 12 36 11
class(mexbin)
## [1] "character"
You can see that the new mexbin vector is a character vector. Let’s make it a factor variable.
mexbin <- as.factor(mexbin)
class(mexbin)
## [1] "factor"
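There’s also a shortcut for this kind of binning. Here’s a sketch using the cut() function, which assigns the bins and returns a factor in one step. The homicides vector is a made-up sample of six monthly totals taken from the table above, and the labels are my own.

```r
# Six monthly totals picked from the table above, for illustration
homicides <- c(245, 497, 540, 994, 1045, 1717)

# breaks define the bin edges: 0 to 500, above 500 to 1000, above 1000
# include.lowest = TRUE makes sure a value of exactly 0 lands in the first bin
bins <- cut(homicides,
            breaks = c(0, 500, 1000, Inf),
            labels = c("1. 0-500", "2. 501-1000", "3. 1001+"),
            include.lowest = TRUE)

table(bins)  # two months fall in each bin
class(bins)  # "factor", so no as.factor() step is needed
```

Because these are whole-number counts, the bins come out the same as with the step-by-step logical indexing approach.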
Let’s plot that variable, you know, for fun.
plot(mexbin)
While we’re on the plotting subject, let’s talk about graphics.
R gives you a ton of options to plot variables and to take full control of the options.
Let’s add a few things to the plot we already made.
plot(mexbin, main = "Drug-Related Homicides in Mexico",
xlab = "Homicides Per Month",
ylab = "Number of Months",
col = "red")
Colors in R
Red is a fairly vanilla color.
How many colors do you think there are in R?
Use the colors() function to find out.
colors()
## [1] "white" "aliceblue" "antiquewhite"
## [4] "antiquewhite1" "antiquewhite2" "antiquewhite3"
## [7] "antiquewhite4" "aquamarine" "aquamarine1"
## [10] "aquamarine2" "aquamarine3" "aquamarine4"
## [13] "azure" "azure1" "azure2"
## [16] "azure3" "azure4" "beige"
## [19] "bisque" "bisque1" "bisque2"
## [22] "bisque3" "bisque4" "black"
## [25] "blanchedalmond" "blue" "blue1"
## [28] "blue2" "blue3" "blue4"
## [31] "blueviolet" "brown" "brown1"
## [34] "brown2" "brown3" "brown4"
## [37] "burlywood" "burlywood1" "burlywood2"
## [40] "burlywood3" "burlywood4" "cadetblue"
## [43] "cadetblue1" "cadetblue2" "cadetblue3"
## [46] "cadetblue4" "chartreuse" "chartreuse1"
## [49] "chartreuse2" "chartreuse3" "chartreuse4"
## [52] "chocolate" "chocolate1" "chocolate2"
## [55] "chocolate3" "chocolate4" "coral"
## [58] "coral1" "coral2" "coral3"
## [61] "coral4" "cornflowerblue" "cornsilk"
## [64] "cornsilk1" "cornsilk2" "cornsilk3"
## [67] "cornsilk4" "cyan" "cyan1"
## [70] "cyan2" "cyan3" "cyan4"
## [73] "darkblue" "darkcyan" "darkgoldenrod"
## [76] "darkgoldenrod1" "darkgoldenrod2" "darkgoldenrod3"
## [79] "darkgoldenrod4" "darkgray" "darkgreen"
## [82] "darkgrey" "darkkhaki" "darkmagenta"
## [85] "darkolivegreen" "darkolivegreen1" "darkolivegreen2"
## [88] "darkolivegreen3" "darkolivegreen4" "darkorange"
## [91] "darkorange1" "darkorange2" "darkorange3"
## [94] "darkorange4" "darkorchid" "darkorchid1"
## [97] "darkorchid2" "darkorchid3" "darkorchid4"
## [100] "darkred" "darksalmon" "darkseagreen"
## [103] "darkseagreen1" "darkseagreen2" "darkseagreen3"
## [106] "darkseagreen4" "darkslateblue" "darkslategray"
## [109] "darkslategray1" "darkslategray2" "darkslategray3"
## [112] "darkslategray4" "darkslategrey" "darkturquoise"
## [115] "darkviolet" "deeppink" "deeppink1"
## [118] "deeppink2" "deeppink3" "deeppink4"
## [121] "deepskyblue" "deepskyblue1" "deepskyblue2"
## [124] "deepskyblue3" "deepskyblue4" "dimgray"
## [127] "dimgrey" "dodgerblue" "dodgerblue1"
## [130] "dodgerblue2" "dodgerblue3" "dodgerblue4"
## [133] "firebrick" "firebrick1" "firebrick2"
## [136] "firebrick3" "firebrick4" "floralwhite"
## [139] "forestgreen" "gainsboro" "ghostwhite"
## [142] "gold" "gold1" "gold2"
## [145] "gold3" "gold4" "goldenrod"
## [148] "goldenrod1" "goldenrod2" "goldenrod3"
## [151] "goldenrod4" "gray" "gray0"
## [154] "gray1" "gray2" "gray3"
## [157] "gray4" "gray5" "gray6"
## [160] "gray7" "gray8" "gray9"
## [163] "gray10" "gray11" "gray12"
## [166] "gray13" "gray14" "gray15"
## [169] "gray16" "gray17" "gray18"
## [172] "gray19" "gray20" "gray21"
## [175] "gray22" "gray23" "gray24"
## [178] "gray25" "gray26" "gray27"
## [181] "gray28" "gray29" "gray30"
## [184] "gray31" "gray32" "gray33"
## [187] "gray34" "gray35" "gray36"
## [190] "gray37" "gray38" "gray39"
## [193] "gray40" "gray41" "gray42"
## [196] "gray43" "gray44" "gray45"
## [199] "gray46" "gray47" "gray48"
## [202] "gray49" "gray50" "gray51"
## [205] "gray52" "gray53" "gray54"
## [208] "gray55" "gray56" "gray57"
## [211] "gray58" "gray59" "gray60"
## [214] "gray61" "gray62" "gray63"
## [217] "gray64" "gray65" "gray66"
## [220] "gray67" "gray68" "gray69"
## [223] "gray70" "gray71" "gray72"
## [226] "gray73" "gray74" "gray75"
## [229] "gray76" "gray77" "gray78"
## [232] "gray79" "gray80" "gray81"
## [235] "gray82" "gray83" "gray84"
## [238] "gray85" "gray86" "gray87"
## [241] "gray88" "gray89" "gray90"
## [244] "gray91" "gray92" "gray93"
## [247] "gray94" "gray95" "gray96"
## [250] "gray97" "gray98" "gray99"
## [253] "gray100" "green" "green1"
## [256] "green2" "green3" "green4"
## [259] "greenyellow" "grey" "grey0"
## [262] "grey1" "grey2" "grey3"
## [265] "grey4" "grey5" "grey6"
## [268] "grey7" "grey8" "grey9"
## [271] "grey10" "grey11" "grey12"
## [274] "grey13" "grey14" "grey15"
## [277] "grey16" "grey17" "grey18"
## [280] "grey19" "grey20" "grey21"
## [283] "grey22" "grey23" "grey24"
## [286] "grey25" "grey26" "grey27"
## [289] "grey28" "grey29" "grey30"
## [292] "grey31" "grey32" "grey33"
## [295] "grey34" "grey35" "grey36"
## [298] "grey37" "grey38" "grey39"
## [301] "grey40" "grey41" "grey42"
## [304] "grey43" "grey44" "grey45"
## [307] "grey46" "grey47" "grey48"
## [310] "grey49" "grey50" "grey51"
## [313] "grey52" "grey53" "grey54"
## [316] "grey55" "grey56" "grey57"
## [319] "grey58" "grey59" "grey60"
## [322] "grey61" "grey62" "grey63"
## [325] "grey64" "grey65" "grey66"
## [328] "grey67" "grey68" "grey69"
## [331] "grey70" "grey71" "grey72"
## [334] "grey73" "grey74" "grey75"
## [337] "grey76" "grey77" "grey78"
## [340] "grey79" "grey80" "grey81"
## [343] "grey82" "grey83" "grey84"
## [346] "grey85" "grey86" "grey87"
## [349] "grey88" "grey89" "grey90"
## [352] "grey91" "grey92" "grey93"
## [355] "grey94" "grey95" "grey96"
## [358] "grey97" "grey98" "grey99"
## [361] "grey100" "honeydew" "honeydew1"
## [364] "honeydew2" "honeydew3" "honeydew4"
## [367] "hotpink" "hotpink1" "hotpink2"
## [370] "hotpink3" "hotpink4" "indianred"
## [373] "indianred1" "indianred2" "indianred3"
## [376] "indianred4" "ivory" "ivory1"
## [379] "ivory2" "ivory3" "ivory4"
## [382] "khaki" "khaki1" "khaki2"
## [385] "khaki3" "khaki4" "lavender"
## [388] "lavenderblush" "lavenderblush1" "lavenderblush2"
## [391] "lavenderblush3" "lavenderblush4" "lawngreen"
## [394] "lemonchiffon" "lemonchiffon1" "lemonchiffon2"
## [397] "lemonchiffon3" "lemonchiffon4" "lightblue"
## [400] "lightblue1" "lightblue2" "lightblue3"
## [403] "lightblue4" "lightcoral" "lightcyan"
## [406] "lightcyan1" "lightcyan2" "lightcyan3"
## [409] "lightcyan4" "lightgoldenrod" "lightgoldenrod1"
## [412] "lightgoldenrod2" "lightgoldenrod3" "lightgoldenrod4"
## [415] "lightgoldenrodyellow" "lightgray" "lightgreen"
## [418] "lightgrey" "lightpink" "lightpink1"
## [421] "lightpink2" "lightpink3" "lightpink4"
## [424] "lightsalmon" "lightsalmon1" "lightsalmon2"
## [427] "lightsalmon3" "lightsalmon4" "lightseagreen"
## [430] "lightskyblue" "lightskyblue1" "lightskyblue2"
## [433] "lightskyblue3" "lightskyblue4" "lightslateblue"
## [436] "lightslategray" "lightslategrey" "lightsteelblue"
## [439] "lightsteelblue1" "lightsteelblue2" "lightsteelblue3"
## [442] "lightsteelblue4" "lightyellow" "lightyellow1"
## [445] "lightyellow2" "lightyellow3" "lightyellow4"
## [448] "limegreen" "linen" "magenta"
## [451] "magenta1" "magenta2" "magenta3"
## [454] "magenta4" "maroon" "maroon1"
## [457] "maroon2" "maroon3" "maroon4"
## [460] "mediumaquamarine" "mediumblue" "mediumorchid"
## [463] "mediumorchid1" "mediumorchid2" "mediumorchid3"
## [466] "mediumorchid4" "mediumpurple" "mediumpurple1"
## [469] "mediumpurple2" "mediumpurple3" "mediumpurple4"
## [472] "mediumseagreen" "mediumslateblue" "mediumspringgreen"
## [475] "mediumturquoise" "mediumvioletred" "midnightblue"
## [478] "mintcream" "mistyrose" "mistyrose1"
## [481] "mistyrose2" "mistyrose3" "mistyrose4"
## [484] "moccasin" "navajowhite" "navajowhite1"
## [487] "navajowhite2" "navajowhite3" "navajowhite4"
## [490] "navy" "navyblue" "oldlace"
## [493] "olivedrab" "olivedrab1" "olivedrab2"
## [496] "olivedrab3" "olivedrab4" "orange"
## [499] "orange1" "orange2" "orange3"
## [502] "orange4" "orangered" "orangered1"
## [505] "orangered2" "orangered3" "orangered4"
## [508] "orchid" "orchid1" "orchid2"
## [511] "orchid3" "orchid4" "palegoldenrod"
## [514] "palegreen" "palegreen1" "palegreen2"
## [517] "palegreen3" "palegreen4" "paleturquoise"
## [520] "paleturquoise1" "paleturquoise2" "paleturquoise3"
## [523] "paleturquoise4" "palevioletred" "palevioletred1"
## [526] "palevioletred2" "palevioletred3" "palevioletred4"
## [529] "papayawhip" "peachpuff" "peachpuff1"
## [532] "peachpuff2" "peachpuff3" "peachpuff4"
## [535] "peru" "pink" "pink1"
## [538] "pink2" "pink3" "pink4"
## [541] "plum" "plum1" "plum2"
## [544] "plum3" "plum4" "powderblue"
## [547] "purple" "purple1" "purple2"
## [550] "purple3" "purple4" "red"
## [553] "red1" "red2" "red3"
## [556] "red4" "rosybrown" "rosybrown1"
## [559] "rosybrown2" "rosybrown3" "rosybrown4"
## [562] "royalblue" "royalblue1" "royalblue2"
## [565] "royalblue3" "royalblue4" "saddlebrown"
## [568] "salmon" "salmon1" "salmon2"
## [571] "salmon3" "salmon4" "sandybrown"
## [574] "seagreen" "seagreen1" "seagreen2"
## [577] "seagreen3" "seagreen4" "seashell"
## [580] "seashell1" "seashell2" "seashell3"
## [583] "seashell4" "sienna" "sienna1"
## [586] "sienna2" "sienna3" "sienna4"
## [589] "skyblue" "skyblue1" "skyblue2"
## [592] "skyblue3" "skyblue4" "slateblue"
## [595] "slateblue1" "slateblue2" "slateblue3"
## [598] "slateblue4" "slategray" "slategray1"
## [601] "slategray2" "slategray3" "slategray4"
## [604] "slategrey" "snow" "snow1"
## [607] "snow2" "snow3" "snow4"
## [610] "springgreen" "springgreen1" "springgreen2"
## [613] "springgreen3" "springgreen4" "steelblue"
## [616] "steelblue1" "steelblue2" "steelblue3"
## [619] "steelblue4" "tan" "tan1"
## [622] "tan2" "tan3" "tan4"
## [625] "thistle" "thistle1" "thistle2"
## [628] "thistle3" "thistle4" "tomato"
## [631] "tomato1" "tomato2" "tomato3"
## [634] "tomato4" "turquoise" "turquoise1"
## [637] "turquoise2" "turquoise3" "turquoise4"
## [640] "violet" "violetred" "violetred1"
## [643] "violetred2" "violetred3" "violetred4"
## [646] "wheat" "wheat1" "wheat2"
## [649] "wheat3" "wheat4" "whitesmoke"
## [652] "yellow" "yellow1" "yellow2"
## [655] "yellow3" "yellow4" "yellowgreen"
There are a whopping 657 colors in R that you can use.
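You don’t have to scroll through the whole list to find what you want. Here’s a quick sketch that counts the colors and searches the names for a pattern; "seagreen" is just the search term I picked.

```r
# Store the full vector of built-in color names
all_colors <- colors()
length(all_colors)  # 657

# Find every color name that contains "seagreen"
sea_greens <- grep("seagreen", all_colors, value = TRUE)
sea_greens

# Look up the red, green, and blue components of one color
col2rgb("seagreen1")
```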
That red is kind of jarring. Let’s make it something like a sea green. We’ll need a lot of happy to get over all that sad.
plot(mexbin, main = "Drug-Related Homicides in Mexico",
xlab = "Homicides Per Month",
ylab = "Number of Months",
col = "seagreen1")
Or maybe we want to graph the homicides over time.
plot(mexico$Mexico,type="l",lwd=3,
xlab="Months: January 2008 - November 2012",
ylab="Drug-Related Homicides Per Month",
main="Reforma Ejecutometro")
lines(mexico$Heroin,lty="dotted",col="blue",lwd=3)
lines(mexico$GoldenTriangle,lty="dashed",col="red",lwd=3)
legend("topleft", lty = c("solid", "dotted", "dashed"), lwd = 3, # line types match the lines drawn above
col = c("black", "blue", "red"),
legend = c("Mexico", "Heroin States", "Golden Triangle"))
Ooh, let’s make a boxplot to see how the homicides are distributed across the months.
boxplot(mexico$Mexico ~ mexico$Month)
That’s fine, but it’s not pretty.
The independent variable is a character string, which R converts to a factor, so R is ordering the months alphabetically.
We need months in the order of, you know, months.
We can do it, but we need to sort of trick R into doing it.
Ready? Hold on to your butts.
monthlevel<-vector()
length(monthlevel)<-length(mexico$Month)
monthlevel[mexico$Month=="January"]<-1
monthlevel[mexico$Month=="February"]<-2
monthlevel[mexico$Month=="March"]<-3
monthlevel[mexico$Month=="April"]<-4
monthlevel[mexico$Month=="May"]<-5
monthlevel[mexico$Month=="June"]<-6
monthlevel[mexico$Month=="July"]<-7
monthlevel[mexico$Month=="August"]<-8
monthlevel[mexico$Month=="September"]<-9
monthlevel[mexico$Month=="October"]<-10
monthlevel[mexico$Month=="November"]<-11
monthlevel[mexico$Month=="December"]<-12
monthlevel<-as.factor(monthlevel)
levels(monthlevel)<-c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
boxplot(totalmex~monthlevel,
main="Distribution of Homicides by Month of the Year",
xlab="Month",
ylab="Homicides",
col=rep(c("Yellow","Red","Green","Blue"),times=3))
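For what it’s worth, there’s a shorter route to the same ordering. This sketch uses factor() with R’s built-in month.name and month.abb constants to set the level order and the short labels in one step. The months_demo vector is a made-up sample, not the real Month variable.

```r
# A few month names in no particular order, for illustration
months_demo <- c("March", "January", "December", "February", "January")

# Default behavior: levels are sorted alphabetically
by_name <- factor(months_demo)
levels(by_name)

# Calendar order: month.name supplies the level order,
# month.abb supplies the three-letter labels
by_calendar <- factor(months_demo, levels = month.name, labels = month.abb)
levels(by_calendar)
table(by_calendar)
```

On the real data, something like factor(mexico$Month, levels = month.name, labels = month.abb) should give the same factor as the longer approach, assuming the month names in the file are spelled out in English exactly as in month.name.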